Process and apparatus for automatically indexing
documents of a set of computers of a network



Technical field of the invention 



The invention relates to telecommunications and more particularly to a
process for automatically indexing files and documents associated with computers
connected to a network.

Background art

The development of computers and Information Handling Systems (I.H.S.)
continuously increases the volume of information which is created, processed and
stored within computers. Every user is now faced with the difficulty of managing this
considerable information and the great number of documents stored within his
computer, and of retrieving particular files when he wishes to do so.

Software programs exist in the art for indexing the files of a computer for the
purpose of facilitating their access to the user. Generally speaking, those solutions
are based on a systematic scanning of the different files and specifically the
particular documents containing user's data for the purpose of extracting relevant
words and items which can serve as a direct access point to the individual files to
which they refer.

As the indexing process involves the successive scanning of all the
documents stored within a machine, such a process requires a non-negligible
amount of processing resources at the level of the individual machine. This may
hinder the use and the generalization of the indexing technique on the end user's
computer.
In addition, most computers which are used in the environment of a company
or a private organization are now connected to, or constitute, a network. Examples
of such networks are referred to as Intranets. In such a corporate environment the
distribution of and access to enterprise knowledge takes on particular importance
and it is clear that the indexing operation should not be retained at the individual
level of the end user of the computer but at the level of the network manager, e.g.
the Information Technology (I.T.) Administrator.

Because the information which is continuously created, processed and stored
within the network of a company has increased in importance, the IT Administrator
now receives, in addition to his traditional remit, the task of preserving and indexing
the documents of a corporation. It is also usually the responsibility of the IT
Administrator to manage security issues raised by this particular type of
intellectual asset.

It is therefore essential that the IT Administrator be given technical tools
which facilitate, on one hand, access to safe and/or sensitive information for
authorized users while preventing, on the other hand, any misuse of that
information.

The problem to be solved by the present invention is to facilitate the
incorporation of the indexing processes and techniques which are particularly
adapted to a corporate environment for instance, while minimizing the processing
resources required at the level of the local machine.

Summary of the invention 

In one aspect the invention provides for a process for indexing files residing on a
computer, characterized by the steps of:

- executing one or more periodic backup operations on the files, said backup
operation including the step of scanning the files;

- using said scanning operation to derive a set of itemized indexes for
subsequent use in obtaining direct access to said files.



The process preferably executes a periodic backup of the system and/or user
files, wherein preferably the user files are indexed.

During the backup operation of the user's documents, the process may index
the files for the purpose of creating a set of itemized indexes which can serve as a
set of access points to those files.

A scanning operation may be used both for generating the signature of a file
and for extracting the key words and indexes for that file.
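As an illustration only, the shared scanning operation could be sketched as follows. The function name `scan_document`, the choice of SHA-256 as the signature and the simple word-frequency keyword rule are assumptions made for this sketch; the description does not specify them.

```python
import hashlib
import re
from collections import Counter


def scan_document(data: bytes, top_n: int = 5):
    """Derive a signature and a keyword list from one read of the content."""
    # The signature identifies this exact content (backup side).
    signature = hashlib.sha256(data).hexdigest()
    # The same bytes feed the keyword extraction (indexing side).
    text = data.decode("utf-8", errors="ignore").lower()
    words = re.findall(r"[a-z]{4,}", text)  # assumed rule: words of 4+ letters
    keywords = [word for word, _ in Counter(words).most_common(top_n)]
    return signature, keywords
```

The point of the sketch is that the file content is obtained once and then serves both purposes, which is where the saving in processing resources would come from.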

This provides an indexing process which is well adapted to a corporate
environment and which allows the creation of a centralized indexing system allowing
storage and indexing of documents on a network while minimizing the processing
resources required by the end user computers attached to the network.

It is a further object of the present invention to provide a network
indexing system which is well adapted to achieve networked knowledge distribution
while preserving the security of the documents that are indexed and preventing
unauthorized access to the indexed documents.



The process can be used for indexing a wide variety of documents, including
WORD ™ files, as well as compound files such as emails, .cab files and the like.

By using the same scanning operation for the backup and indexing
procedures, access to the files may be optimized as can be the amount of
processing resources required for the backup and indexing operations. In addition,
the backup and the indexing operations can be readily and simultaneously
automated without requiring an additional intervention from the user.



HP 50002133 



It can be seen that the process is particularly adapted for use in network
environments and for providing a centralized index of all the documents available
within said networks.



Each local computer which is connected to the network may incorporate a 
Backup and Indexing agent which is adapted to substantially simultaneously 
perform a backup of the files - including the user's personal files - and the indexing 
of said files by a Backup and Indexing server communicating with said network. 



In the corporate environment, the user is unaware of the indexing operation.
Further, the IT administrator is given the technical tools to manage the intellectual
assets of a given company by simultaneously controlling the backup and the
indexing process at the server.

In a preferred embodiment, the Backup and Indexing server incorporates a
centralized index which allows direct reference to and access from a local computer
to documents available on the network, as well as local indexes which may be
transmitted back to the local computer.

Preferably, at least one indexing attribute is associated with each file for the
purpose of controlling the indexing process executed by said Backup and Indexing
server.



The indexing attribute may employ an Access Control List (A.C.L.) such as
that which is available in WINDOWS ™ NT-type or UNIX type machines.



Preferably, the indexing process is executed by means of a server which is
associated with a centralized database for storing the backup files.

Therefore, the local computer is not burdened with the task of indexing the
files, and the full processing resources of the local machine are available for the
user. Further, since the server compiles an overall index of all the files stored within
the different machines of the network, it can be seen that the whole set of files
forming the knowledge-based assets of a company or a private organisation can be
stored within a centralized database and become accessible, via a unique indexing
table, to the users of the network.

Printed: 31-01-2001



In a further embodiment, the server and the database of backup files and
documents may be located outside the Intranet network, and the size of the
software code of the agent may be substantially minimized by means of the Hyper
Text Transfer (H.T.T.P., or the secure version H.T.T.P.S.) or File Transfer (F.T.P.)
protocols.

In yet a further embodiment, a signature is computed for each individual file
or document for the purpose of determining whether said file or document is
already loaded within the database of backup files and whether it has been included
within the table of indexes.



Preferably, each file or document which is to be backed up and indexed is
allocated a specific attribute which is used for controlling the indexing process of
that file. By use of that attribute, each individual user who creates a file may retain
full control of the indexing process executed in relation to that file, and therefore
of the files referenced within the table of indexes.



The invention also provides for a knowledge-base system adapted to
automate, at the same time and in a manner of which the user is unaware, the periodic
backup and indexing of a user's documents stored on the computers of a network.

The invention further provides for a process which is adapted to carry out an
enhanced backup system, preferably by means of a software program for a
stand-alone computer, the process including the steps of opening each file which is to be
backed up and, during the same operation, compiling a set of indexes representing
that file for the purpose of adding to a table of indexes, thereby allowing direct
access to said user's documents.

In yet a further embodiment, the invention provides for a computer or network
of computers adapted to carry out the method as hereinbefore described.

An exemplary embodiment of the invention will now be described by way of
example only and with reference to the accompanying drawings in which:

Figure 1 illustrates the architecture of different computers attached to an Intranet 
network; 

Figure 2 is a drawing showing the initialization of the backup & indexing process; 

Figures 3 and 4 illustrate the periodical backup and indexing process; and

Figure 5 is a flow chart of the search process into the local and the centralized
indexes.



Description of the preferred embodiment of the invention 



With respect to figure 1 there is shown the architecture of a corporate
environment which can particularly take advantage of the backup and indexing
process which will be described below. An Intranet network includes a first
sub-network 10 and a second sub-network 20. First sub-network 10 includes computers
1 and 4, a server 2 and a router 3 which is used for the direct connection to
sub-network 20, the latter comprising a computer 11, a printer 12, a router 13 and a server
14. The Intranet network communicates with the Internet network 70 via a proxy 30.
A firewall arrangement 80 may be used for securing the exchange of communication
between the Internet network 70 and the Intranet network. As known by the man
skilled in the art, a firewall is generally based on two distinct servers: a first one
collecting the information received from the Internet and which is to be forwarded
inside the Intranet, and a second server which is used for requests originating from
the Intranet and which are to be forwarded outside the Intranet. The arrangement
and operation of a firewall is well known to the skilled man and will not be
discussed further.
Each computer, such as computer 1, incorporates a Backup & Indexing agent
for executing a backup procedure with respect to the files of the user's computer.
This may include the system files and the documents containing user's data. In the
preferred embodiment, the Backup & Indexing agent periodically collects a copy of
the files which were created or modified since the last backup operation. More
particularly, an external server 50 is associated with a backup database 60 for
storing the backup files and documents from all the computers and systems of the
Intranet network.



Figure 1 shows a server 50 with a backup database 60 that is located outside
the boundaries of the Intranet network, and which can be accessed from the Intranet
via the Uniform Resource Locator (U.R.L.). It is considered that the skilled man
can readily adapt the process which is described below for the purpose of storing
the backup files within a database and a server located within the Intranet, for
instance server 2 or server 14.



The exemplary description below will elaborate in more detail the case of
backing up the files and documents of the network within the external server 50 and
database 60.



There will now be described how the backup procedure can be
advantageously adapted and combined with indexing techniques for the purpose of
allowing an effective backup and indexing solution adapted to a corporate
environment. The procedure may implement the backup process which is
specifically described in European patent application C041 0062.4 entitled
"Automatic Backup/recovery Process", the disclosure of which is herein
incorporated by reference.

The backup process which is described below is based on the successive
transmission of a copy of the files and documents of the computers of the network to
external server 50 via the firewall 80. Each document or file which is to be backed
up is analysed in terms of object, and is transmitted with an object identification, an
object attribute including a specific set of indexing attributes, an object signature and
an object content. Once transmitted to, and received by, server 50, the documents
are stored within database 60 in order to form a backup data set, which comprises
the description of all the files, the attributes, the directories, and labels. This data
constitutes a saved volume. Each stored object consists of an image of a backup
object of the original configuration of said volume, and which is to be stored within
the database 60. As will be shown below, the identification, the attributes and the
signature are used for uniquely comparing a stored object with a backup object.
Additionally, the contents may be used for rebuilding an object which is saved from
a previous backup.
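The four parts of a backup object named above can be pictured with a small record type. This is a sketch under stated assumptions: the class name `BackupObject`, the field names, the `same_as` helper and the use of SHA-256 for the signature are all hypothetical and are not taken from the description.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BackupObject:
    identifier: str                                 # object identification, e.g. a path
    content: bytes                                  # object content
    attributes: dict = field(default_factory=dict)  # includes indexing attributes
    signature: str = ""                             # object signature

    def __post_init__(self):
        # Assumed signature scheme: a SHA-256 digest of the content.
        if not self.signature:
            self.signature = hashlib.sha256(self.content).hexdigest()

    def same_as(self, stored: "BackupObject") -> bool:
        """Compare on identification, attributes and signature only."""
        return (self.identifier, self.attributes, self.signature) == (
            stored.identifier, stored.attributes, stored.signature)
```

Note that the comparison never touches the content itself; the content is only needed when an object has to be transmitted or rebuilt.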



Practically, it has been shown that the transmission of the backup objects
may take substantial advantage of the FTP and particularly of the Hypertext
Transfer (HTTP, or its secured version HTTPS) protocol. Such an arrangement
entails two substantial advantages. The first results in a simpler design of the agent
component which can exploit the HTTP protocol and transmit, potentially in a
secured fashion, the different backup documents through the Intranet and Internet
network, to the server 50. Additionally, by encapsulating the different backup objects
which were defined above into HTTP POST requests, the backup objects can be
reliably conveyed throughout the network even where a firewall system has been
implemented in order to secure the Intranet. In particular, no adaptation of the
pre-existing firewall system settings is necessary and the backup process can be
immediately executed and applied, at no additional cost. This results in a
substantial advantage as the skilled man is aware that, in most cases, the
adaptation of existing firewall parameters can be a complex and costly operation.
The process which will be described below achieves an effective backup procedure
without specific adaptation of the pre-existing network configuration.
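A minimal sketch of encapsulating one backup object in an HTTP POST request is given below. The function name, the `X-Backup-Metadata` header, the JSON encoding of the metadata and the example URL are assumptions for illustration; the description only states that backup objects are carried in HTTP POST requests. The request is built but not sent.

```python
import hashlib
import json
import urllib.request


def build_backup_post(server_url, identifier, attributes, content):
    """Wrap one backup object in a POST request (built here, not sent)."""
    metadata = {
        "id": identifier,
        "attributes": attributes,
        # Assumed signature scheme: SHA-256 of the object content.
        "signature": hashlib.sha256(content).hexdigest(),
    }
    headers = {
        "Content-Type": "application/octet-stream",
        "X-Backup-Metadata": json.dumps(metadata),  # hypothetical header name
    }
    return urllib.request.Request(
        server_url, data=content, headers=headers, method="POST")
```

An agent would then pass the returned request to `urllib.request.urlopen`; using an `https://` URL gives the secured variant without further changes to the agent code.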



The backup and indexing process involves an initialization procedure for the
purpose of creating a first set of backup files and documents stored within database
60. The initialization procedure may be launched in response to a request from the
user. In one embodiment, the backup and indexing agent may be pre-installed in the
local computer and be represented by a corresponding icon on the Desktop. This
can be used to launch the initialization procedure. Alternatively, the Backup &
Indexing agent can be downloaded from backup server 50 when the user accesses
the latter via his browser.



With reference to figure 2, the initialization procedure starts with a step 21
which corresponds to a compilation of an exhaustive list of the files and/or
documents residing on the local user machine.



In step 22, the Backup & Indexing agent initiates remote access to the server
50 and transmits the list of system files and user documents to the server 50. For
instance this may be by means of the HTTP protocol, such as an HTTP POST. Other
protocols can be used such as File Transfer Protocol (F.T.P.), the Network File
System (N.F.S.) approach or similar models of network file systems. In the case of
the H.T.T.P. protocol, the secure version of the latter may be particularly
appropriate.

In step 23, the Backup and Indexing agent transmits to the remote server 50
a copy of each file and document, including the attributes. In addition to the
standard attributes which are known, for example, in the context of the WINDOWS
™ NT-type or Linux operating systems, the Backup and Indexing agent transmits
at least one additional attribute which is used for the purpose of controlling the
indexing process executed in the server. As an example of an indexing attribute, the
skilled man can use the Access Control List (A.C.L.) known in relation to the
WINDOWS ™ NT or UNIX type operating systems.

In one embodiment, a first indexing attribute is used for controlling the
indexing process of the considered document and the incorporation of at least one
reference to that document within the centralized index which is maintained by
server 50.



In an alternative embodiment, the first indexing attribute is associated with a
second indexing attribute which may be used for more precisely controlling, during
the search process, selective access to the documents stored within database 60.
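The interplay of the two indexing attributes can be sketched as a small decision function. The names `IndexingAttributes`, `index_allowed`, `selective_access` and the three outcome labels are hypothetical; the description leaves the concrete encoding of the attributes open (it mentions an Access Control List as one option).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexingAttributes:
    index_allowed: bool     # first attribute: may the file enter the central index?
    selective_access: bool  # second attribute: is direct access restricted?


def index_decision(attrs: IndexingAttributes) -> str:
    """Return how the server might treat a document during indexing."""
    if not attrs.index_allowed:
        return "skip"        # no reference is added to the centralized index
    if attrs.selective_access:
        return "cite-only"   # indexed, but searchers see only a citation
    return "index"           # indexed and directly accessible
```

For example, a document marked `IndexingAttributes(True, True)` would be indexed but shown to other users only as a citation, which matches the selective access behaviour described for the search process.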



The process is designed for analyzing a wide variety of different user's
documents, including text documents such as WORD ™, WORDPERFECT ™,
pFRCE ™ documents etc., as well as compound files which might include textual
information. The analysis of the different files can be based upon an examination of
the filename extension of the document files by the Backup & Indexing agent on the
local machine.
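An examination of the filename extension might look like the sketch below. The extension sets are assumptions chosen for illustration (only .eml, .cab and .zip are named as compound files elsewhere in this description), and the function name `classify` is hypothetical.

```python
import os

# Assumed extension sets; a real agent would carry a fuller, configurable list.
TEXT_EXTENSIONS = {".doc", ".wpd", ".txt", ".html"}
COMPOUND_EXTENSIONS = {".eml", ".cab", ".zip"}


def classify(filename: str) -> str:
    """Classify a file as 'text', 'compound' or 'other' from its extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in TEXT_EXTENSIONS:
        return "text"
    if extension in COMPOUND_EXTENSIONS:
        return "compound"
    return "other"
```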



When all the files and documents are transmitted to server 50, the
initialization process terminates by means of step 24.



With reference to figure 3, there will be described now the periodic process
which is executed for carrying out the simultaneous backup and indexing of the
user's documents.



The process is initiated with step 31. This can be performed by means of a
system scheduler mechanism, such as the Sleep function which is known for
instance in relation to the WINDOWS ™ NT-type operating system. In another
embodiment, it may be possible to start the backup upon the request from the
user.



In a step 32, the Backup and Indexing agent initiates remote access to server
50 and an HTTP "GET" request for the purpose of obtaining a representation of the
remote data set of the backup documents which are stored within the database 60.



In step 33, the server 50 transmits the list of the backup files and documents.
In one embodiment, the information is transmitted by means of an XML file which
contains a table with the list of the backup files and documents, including the
identifiers, the attributes and the signatures. While this step is not absolutely
necessary, since it is possible to keep a local image of the data set within the user's
machine, it has been found to be useful to retrieve the remote data set which is
actually stored within the backup server.
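The XML representation of the remote data set is not specified further. Purely as an illustration, assuming a hypothetical layout with one `object` element per backup file and `id`, `signature` and `index` attributes, the agent could read it as follows:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; element and attribute names are assumptions.
SAMPLE = """<dataset>
  <object id="C:/docs/report.doc" signature="ab12" index="yes"/>
  <object id="C:/docs/notes.txt" signature="cd34" index="no"/>
</dataset>"""


def parse_remote_data_set(xml_text: str) -> dict:
    """Map each object identifier to its signature and indexing attribute."""
    root = ET.fromstring(xml_text)
    return {
        node.get("id"): {
            "signature": node.get("signature"),
            "index": node.get("index") == "yes",
        }
        for node in root.iter("object")
    }
```

The resulting dictionary gives the agent the identifiers and signatures it needs for the later comparison against the local data set.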



In addition to the list of backup files and documents, the server 50 transmits a
local table of indexes of the documents in the local machine. Typically, this index
takes the form of a table which provides, for each itemized reference, a list of the
relevant documents with the paths for permitting direct access. The local table of
indexes will be used during the search process carried out by the Backup &
Indexing agent when the user executes a search using his machine.
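A table giving, for each itemized reference, the list of relevant document paths is commonly called an inverted index. The sketch below shows one possible shape of such a table; the function names and the rule of indexing every word of three or more letters are assumptions, not details taken from the description.

```python
import re
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Build {keyword: sorted list of paths} from {path: text}."""
    index = defaultdict(set)
    for path, text in documents.items():
        # Assumed rule: every distinct word of 3+ letters is a reference.
        for word in set(re.findall(r"[a-z]{3,}", text.lower())):
            index[word].add(path)
    return {word: sorted(paths) for word, paths in index.items()}


def lookup(index: dict, word: str) -> list:
    """Return the paths of the documents citing the given word."""
    return index.get(word.lower(), [])
```

A lookup is then a single dictionary access, so the search carried out on the local machine costs very little compared with rescanning the files.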


In step 34, the Backup & Indexing agent receives that information from server 
50 and stores it in the local machine. 

In step 35, the Agent performs a local analysis of the user's configuration and
identifies all the backup files which are representative of that configuration. It then
establishes a local data set of backup files and documents, including the identifier,
the signature, the attributes and particularly the indexing attribute(s). It should be
noticed that, for the purpose of computing the signature, the agent may create a
copy of the considered object, after having locked access to the latter.

In step 36, the Agent then iteratively processes each backup file or document 
which was identified within the local data set of backup objects. 



In step 37, the process determines whether the considered file or document
has the same identification on the remote data set transmitted by the server 50.



If the answer is yes, then the process checks at step 38 whether the
signature of the considered backup object appears to be the same as that which is
reported in the remote data set. If this is the case, the considered object appears to
be unmodified, and the process then proceeds with step 39 which loops again to
step 36 for processing the next file or document within the list of the local data set.



If the tests of step 37 or 38 have failed, the process proceeds with the
transmission of the considered backup file to the server 50 in step 40. This is
achieved by means of an appropriate HTTPS POST request with the considered
object, including the identifier, the attributes, the contents and the signature. It
should be noticed that, for the purpose of computing the signature of an object and
processing it, the backup agent may advantageously create a local copy of the
considered object, once it has been locked. As soon as the local copy is made, the
original object can then be unlocked and the Agent may compute the signature on
the local copy. This ensures that the considered object does not remain locked too
long.
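The loop of steps 36 to 40 amounts to comparing two tables of identifiers and signatures. A minimal sketch follows, assuming both data sets are held as dictionaries mapping an identifier to a signature; the function name `objects_to_transmit` is hypothetical.

```python
def objects_to_transmit(local: dict, remote: dict) -> list:
    """Return the identifiers that must be sent to the server (step 40)."""
    changed = []
    for identifier, signature in local.items():   # step 36: iterate local set
        if identifier not in remote:              # step 37 fails: new object
            changed.append(identifier)
        elif remote[identifier] != signature:     # step 38 fails: modified
            changed.append(identifier)
        # Otherwise the object is unmodified; step 39 moves to the next one.
    return changed
```

Only new or modified objects are returned, which is what keeps the volume of data sent through the network small on each periodic run.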



In the preferred embodiment, the backup and indexing agent incorporates a
means for processing the compound files for the purpose of extracting from those
the different objects and computing their signatures for the purpose of processing as
explained above. This permits the processing and transmission, where necessary,
of the individual components of compound files, for the purpose of reducing the
amount of data to be transmitted through the network. As known by the skilled man,
such compound files include .eml, .avi, .wav, .riff, .zip files. In one embodiment, the
backup technique may further use differential backup and/or compression
techniques for the purpose of reducing the volume of the data to be transmitted to
the server.



It can be seen that the use of the HTTP protocol allows a substantial
reduction in the size of the software program necessary for implementing the
Backup & Indexing agent, since it is the HTTP protocol, and particularly the secured
version HTTPS, which handles the main parts of the transmission process.
Additionally, since the HTTP protocol is able to be readily interpreted by the firewall
procedures which the IT Manager may have arranged for securing a network, the
backup procedure may be readily applied within a corporate organization, and an
Intranet network.



With respect to figure 4, when all the backup files and documents have been
processed, the loop terminates and the Backup and Indexing Agent transmits, at
step 41, the list of the local data set. The server 50 receives that local data set and
then launches a loop for processing all the files and documents contained within the
remote data set. For each object which is identified within the remote set of data, the
server checks whether the considered identification exists in the local data set, in
which case the process loops back to the next object identified within the remote
data set. However, if the file or document appears to be no longer reported within
the local data set received from the backup agent, the server erases the latter from
the remote data set and deletes the contents of that object within the database 60.
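The server-side clean-up loop can be sketched as follows, under the assumption that the remote data set is a dictionary keyed by object identifier and that the local data set arrives as a collection of identifiers. The function name `prune_remote` is hypothetical, and deleting the dictionary entry stands in for deleting the object contents from database 60.

```python
def prune_remote(remote: dict, local_ids) -> list:
    """Erase from the remote data set every object no longer reported locally."""
    local_ids = set(local_ids)
    removed = [identifier for identifier in remote if identifier not in local_ids]
    for identifier in removed:
        # Stand-in for erasing the entry and deleting its stored contents.
        del remote[identifier]
    return removed
```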



For any new or modified document, an indexing process is launched in a step
42 and controlled in accordance with the value of the indexing attribute assigned to
that document.

In step 43, the server updates the centralized index containing the reference
to all the documents existing within the Intranet network, as well as the local index.

In step 44 the server transmits to the Backup and Indexing Agent in the local 
machine the revised version of the local index which was computed. That local 
index will be used in a search process for a document which will be described 
hereinafter. 

The Backup and Indexing Agent stores the local index at step 45; this 
completes the periodic backup and indexing procedure. 

It can be seen that the technique modifies and extends known backup
procedures which are traditionally used for creating a backup database by
automatically and in parallel compiling a set of indexes which can be stored within a
centralized database. The process may then use that centralized index, in
association with a search engine, for automatically retrieving the documents stored
within the database of backup files and documents, whatever the types of
documents being considered: for example HTML, WORD ™ or even ADOBE ™
files.



The two processes are combined in such a way as to permit systematic
scanning and indexing of the files located on a machine, for the purpose of
constructing an index table of the files. Further, by combining the backup and the
indexing facility in the same entity, i.e. server 50, the user's computer resources
remain fully dedicated to the user. This represents a substantial advantage.



While the process is particularly adapted for use in a corporate environment,
it should be noted, however, that the process can be readily adapted for use with a
stand-alone computer for permitting a simultaneous backup and indexing of the files
located in that computer.



The process may also be readily adapted to the WINDOWS/NT-type, or 
LINUX operating system where attributes and rights exist for each file. 
With respect to figure 5, there will be discussed now the process which is
carried out by the Backup & Indexing agent when the user starts a search within the
index that has been compiled previously.

In step 51, the Backup and Indexing agent receives a request from the user.

In step 52, a first local search is conducted on the local index which
was received from server 50 in step 44 of figure 4.



In step 53, the local search is completed, upon request from the user, by
means of an extensive search within the centralized index elaborated by server 50.



In step 54, the server 50 prepares a list of documents which are presented in
accordance with the value of the second indexing attribute controlling the selective
access. In one embodiment, the server can produce an HTML page containing a list
of links allowing access to the documents. More particularly, for the citations of
documents having a selective access attribute, the user who has requested the
search is made aware of the existence of one citation within the centralized
database but he may not have direct access to that document.



If the user wishes to access one document having such a selective access
indexing attribute, the process automatically prepares an electronic mail which is
automatically transmitted to the originator of the considered document in step 55.



In response to the originator's agreement, the server 50 then automatically
allows the access to the requester in step 56.
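Steps 52, 53 and 55 can be sketched together as shown below. The function names, the dictionary shape of the two indexes and the wording of the electronic mail are assumptions made for illustration; the message is only built here, not sent.

```python
from email.message import EmailMessage


def search(term, local_index, central_index, extend=False):
    """Search the local index first, then optionally the centralized one."""
    results = list(local_index.get(term, []))       # step 52: local search
    if extend:                                      # step 53: upon user request
        for path in central_index.get(term, []):
            if path not in results:
                results.append(path)
    return results


def access_request(requester, originator, path):
    """Prepare the mail asking the originator for access (step 55)."""
    message = EmailMessage()
    message["From"] = requester
    message["To"] = originator
    message["Subject"] = f"Access request for {path}"
    message.set_content(f"{requester} asks for access to {path}.")
    return message
```

In this sketch the server would call `access_request` only for results whose second indexing attribute marks them as selectively accessible, and would grant access (step 56) once the originator agrees.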
Thus the present invention facilitates the incorporation of indexing procedures
and techniques, in a way which reduces or eliminates the use of local user-based
resources. This may be particularly useful in the context of a corporate environment
where it is generally desirable to minimize the impact of backup, or related
processes, on the performance of a local machine.



Although the invention has been described by way of example and with
reference to particular embodiments it is to be understood that modification and/or
improvements may be made without departing from the scope of the appended
claims.

Where in the foregoing description reference has been made to integers or
elements having known equivalents, then such equivalents are herein incorporated
as if individually set forth.
1. A process for indexing files residing on a computer, characterized by the steps of:

- executing one or more periodic backup operations on the files, said backup 
operation including the step of scanning the files; 

- using said scanning operation to derive a set of itemized indexes for 
subsequent use in obtaining direct access to said files. 



2. An indexing process as claimed in claim 1 characterized in that both text 
processing files and compound files are analyzed and indexed. 

3. An indexing process as claimed in claim 1 or 2 characterized in that it is
implemented in a centralized environment where a server (50) is associated with a
database (60), said database adapted to store backup files and wherein said server
substantially simultaneously carries out the backup and the indexing of the files.

4. An indexing process as claimed in claim 3 characterized in that said server (50)
indexes files residing on a plurality of computers attached to, or constituting, a
network for the purpose of generating a centralized table of indexes loaded on said
server (50).

5. An indexing process as claimed in claim 4 wherein access rights are defined for
each file including at least one indexing right that is used for controlling the indexing
process of the files within said centralized table of indexes.

6. An indexing process as claimed in claim 5 wherein the at least one indexing right
includes: a first indexing attribute which authorizes the indexing of a given file within
the centralized index; and a second indexing attribute defining selective access to
that document.
7. An indexing process as claimed in claim 6 wherein after completion of the backup
of files residing on a first machine, said server (50) transmits to the first machine a
local table of indexes representative of the different documents stored on that first
machine.



8. An indexing process as claimed in any one of claims 3 to 7 wherein transfer of the
files which are to be backed up uses the Hyper Text Transfer (H.T.T.P.), RCP, FTP
or the like protocols.

9. An indexing process as claimed in any one of claims 1 to 8 wherein the files
correspond to system and/or user files.

10. An indexing process as claimed in claim 9 wherein the indexing is performed in
relation to the user files.



11. A process for searching for a file within a set of indexed files, said files stored on
a plurality of computers connected to, or constituting, a network, the files being
indexed in accordance with the indexing process as claimed in claim 6 or 7,
characterized by:

- initiating a search request for a given file, said request containing a set of key
words or indexes;

- processing said search request by reference to a first local table of indexes stored
on one of said plurality of computers in order to locate a first set of relevant files
extracted from said one computer;

- processing, upon request from the user, an additional search within said
centralized index loaded into said server for the purpose of obtaining any additional
results corresponding to files stored on the backup database (60);

- displaying the result of said additional search and, for each or any file having a
selective access attribute, automatically generating an electronic mail to be sent to a
corresponding originator of said file for the purpose of requesting access to said file.

12. An apparatus comprising program code elements for carrying out the process as
claimed in any of claims 1 to

13. A computer program product comprising computer program code stored on a
computer readable medium adapted, when executed on a computer, to perform the
steps of any one of claims 1 to 1

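
The two-stage search of claim 11 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the layout of the index tables, the AND semantics applied to the key words, and the e-mail addresses are all hypothetical, and the message is merely composed with the standard `email.message.EmailMessage` class, not sent.

```python
from dataclasses import dataclass, field
from email.message import EmailMessage


@dataclass
class SearchResult:
    local_hits: set = field(default_factory=set)         # paths on the user's machine
    remote_hits: set = field(default_factory=set)        # (machine, path) from the server
    access_requests: list = field(default_factory=list)  # e-mails to file originators


def lookup(table, keywords):
    """Return the entries indexed under every key word (assumed AND semantics)."""
    sets = [table.get(k.lower(), set()) for k in keywords]
    return set.intersection(*sets) if sets else set()


def compose_request(requester, originator, path):
    """Build (but do not send) an access request for a selectively accessible file."""
    msg = EmailMessage()
    msg["From"] = requester
    msg["To"] = originator
    msg["Subject"] = f"Access request for {path}"
    msg.set_content(f"{requester} requests access to {path}.")
    return msg


def search(keywords, local_table, centralized=None, requester="user@example.com"):
    # Step 1: consult the local table of indexes stored on the user's machine.
    result = SearchResult(local_hits=lookup(local_table, keywords))
    # Step 2: only when the user asks for it, consult the centralized index.
    if centralized is None:
        return result
    for machine, path, originator, selective in sorted(lookup(centralized, keywords)):
        result.remote_hits.add((machine, path))
        # Step 3: a selective access attribute triggers an e-mail to the originator.
        if selective:
            result.access_requests.append(compose_request(requester, originator, path))
    return result
```

Passing `centralized=None` models a user who is satisfied with the local results; supplying the server's index models the additional search made upon request.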
14. A knowledge-base system comprising:

- means for regularly backing up files stored on computers connected to or 
constituting a network; 

- means for substantially simultaneously indexing the files during the backup 
procedure for the purpose of creating and updating a database of backup files and 
documents as well as a centralized index of backed up documents. 



15. A backup process for a stand-alone computer characterized by:

- opening each file which is to be backed up;

- while opening said file, compiling a set of indexes characterizing said file and
which will be incorporated into a table of indexes;

- closing said file upon completion of said backup and said indexing operation.

16. A computer programmed to operate in accordance with the process of any one
of claims 1 to 11 and 15.

17. A computer network adapted to operate in accordance with the process of any
one of claims 1 to 11 and 15.

Abstract

Process and Apparatus for Automatically Indexing
Documents of a Set of Computers of a Network

A process for automatically indexing the documents stored in a computer,
involving the step of executing at regular intervals a periodic backup operation of
the system files and the user's documents. The backup operation is based on a
scanning of all the files for the purpose of computing a signature, and the same
operation is advantageously used for elaborating an index of the user's documents
stored within the computer. Preferably, the invention is used in a network
environment and the backup and indexing operations are carried out by a server
which takes advantage of the internal synergy between the backup and the indexing
operation for the purpose of elaborating a centralized index of the documents
available in the network, which documents could be retrieved from the database
associated with the backup process. Access control rights are used for controlling the
indexing process and for defining selective access to said documents.
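
The synergy described above, where a single scan of each file yields both its backup signature and its index entries, can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `backup_and_index`, the use of SHA-256 as the signature, the word pattern, and reading each file entirely into memory are all assumptions made for brevity.

```python
import hashlib
import re
from pathlib import Path

# Assumption: an index word is any run of three or more letters or digits.
WORD = re.compile(r"[A-Za-z0-9]{3,}")


def backup_and_index(src_dir, dst_dir, known_signatures=None):
    """Scan every file once, backing up changed files and indexing their words.

    Returns (signatures, index): signatures maps a relative path to its hash,
    index maps a lower-cased word to the set of relative paths containing it.
    """
    known = known_signatures if known_signatures is not None else {}
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    signatures, index = {}, {}
    for path in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(src_dir).as_posix()
        # The file is opened once; backup and indexing both happen before it closes.
        with open(path, "rb") as fh:
            data = fh.read()
            sig = hashlib.sha256(data).hexdigest()
            signatures[rel] = sig
            # Back up only files whose signature differs from the previous run.
            if known.get(rel) != sig:
                target = dst_dir / rel
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_bytes(data)
            # Index the same bytes that were just read for the signature.
            for word in WORD.findall(data.decode("utf-8", errors="ignore")):
                index.setdefault(word.lower(), set()).add(rel)
    return signatures, index
```

Feeding the signatures returned by one run into the next run lets unchanged files be skipped by the backup while the index is still rebuilt from the same scan.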
Fig. 4 (flowchart; box labels recovered from the drawing):
- Backup & indexing agent requests list of backup files to server 50
- Server 50 transmits list of backup objects, with local table
- Backup & service agent stores local table of indexes
- Backup & indexing agent computes list of files and documents in local
- For each file in local data set: att content & signature
- Agent transmits local set of files and documents to server
- Server controls indexing in accordance with index att
- Server updates centralized index of network & local index
- Server transmits revised version of local index
- Backup & indexing agent stores revised local index